Non-Markovian Policies in Sequential Decision Problems
Abstract
In this article we prove the validity of the Bellman Optimality Equation and related results for sequential decision problems with a general recursive structure. The characteristic feature of our approach is that non-Markovian policies are also taken into account. The theory is motivated by some experiments with a learning robot.
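For orientation, the familiar discounted, Markovian form of the Bellman optimality equation is sketched here. This is the textbook special case, not the general recursive formulation the article treats. The symbols (state set S, action set A, reward r, transition kernel P, discount factor γ) are assumptions introduced for illustration and do not come from the article.

```latex
V^{*}(s) \;=\; \max_{a \in A} \Big[ r(s,a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^{*}(s') \Big],
\qquad s \in S,\quad 0 \le \gamma < 1 .
```

In this special case the maximization ranges over actions at a single state, which is why attention is usually restricted to Markovian policies; the abstract indicates that the article establishes the corresponding result when non-Markovian policies are admitted as well.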
Related articles
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional meth...
Non-Deterministic Policies In Markovian Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct adaptive treatment strategies, where a sequence of individualized treatments is learned from clinic...
Non-Markovian Control with Gated End-to-End Memory Policy Networks
Partially observable environments present an important open challenge in the domain of sequential control learning with delayed rewards. Despite numerous attempts during the two last decades, the majority of reinforcement learning algorithms and associated approximate models, applied to this context, still assume Markovian state transitions. In this paper, we explore the use of a recently propo...
Efficient Policies for Stationary Possibilistic Markov Decision Processes
Possibilistic Markov Decision Processes offer a compact and tractable way to represent and solve problems of sequential decision under qualitative uncertainty. Even though appealing for its ability to handle qualitative problems, this model suffers from the drowning effect that is inherent to possibilistic decision theory. The present paper proposes to escape the drowning effect by extending to...
Compactness of the space of non-randomized policies in countable-state sequential decision processes
For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact [14, p. 170] in spite of compactness of the set of strategic measures corresponding to all policies [17,2]. We use the compactness result from this paper to show the existence of optima...
Journal: Acta Cybern.
Volume: 13
Publication date: 1998